Semi-Supervised Word Sense Disambiguation for Mixed-Initiative Conversational Spoken Language Translation
Abstract
Lexical ambiguity can cause critical failures in conversational spoken language translation (CSLT) systems when the wrong sense is presented in the target language. In this paper, we present a framework for improving translation of ambiguous source words that (a) constrains statistical machine translation (SMT) decoding with phrase pair clusters to select a desired sense for translation; (b) automatically predicts the intended sense of an ambiguous source word given its context; and (c) combines the two to define a set of interactive strategies that confirm the intended sense of an ambiguous word and guide the system to the correct translation. The novel use of this framework in a real-world CSLT system distinguishes our approach from existing work on word sense disambiguation (WSD) for non-interactive, batch-mode SMT. In addition to reporting metrics that evaluate this approach in an interactive spoken language translation system, we also present offline assessments of the component technologies, viz. constrained SMT decoding with sense-specific phrase pair clusters, and automated word sense prediction.
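The three parts of the framework named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the phrase pairs, cluster labels, cue words, the overlap-based predictor, and all function names below are hypothetical stand-ins chosen only to show how sense prediction, sense-constrained phrase pair selection, and a user confirmation step might fit together.

```python
from collections import Counter

# Hypothetical toy data: each phrase pair for an ambiguous source word is
# tagged with a sense cluster label. None of these entries come from the paper.
PHRASE_PAIRS = [
    {"source": "fire", "target": "TGT_FLAME", "cluster": "flame"},
    {"source": "fire", "target": "TGT_BLAZE", "cluster": "flame"},
    {"source": "fire", "target": "TGT_SHOOT", "cluster": "shoot"},
]

# Hypothetical cue words per sense, standing in for a trained sense predictor.
SENSE_CUES = {
    "flame": {"smoke", "burn", "house", "water"},
    "shoot": {"weapon", "gun", "soldier", "shot"},
}


def predict_sense(context_words, min_margin=1):
    """Return (sense, confident) by counting cue-word overlap with the context."""
    context = set(context_words)
    scores = Counter({s: len(cues & context) for s, cues in SENSE_CUES.items()})
    (best, best_score), (_, second_score) = scores.most_common(2)
    return best, (best_score - second_score) >= min_margin


def constrain_phrase_pairs(word, sense):
    """Keep only the phrase pairs of `word` that belong to the chosen sense cluster."""
    return [p for p in PHRASE_PAIRS if p["source"] == word and p["cluster"] == sense]


def translate_options(word, context_words, confirm=None):
    """Predict the sense; if unsure and a callback is given, ask the user to confirm."""
    sense, confident = predict_sense(context_words)
    if not confident and confirm is not None:
        sense = confirm(word, sorted(SENSE_CUES))  # interactive confirmation step
    return [p["target"] for p in constrain_phrase_pairs(word, sense)]
```

In this sketch a real decoder is replaced by a simple filter over candidate phrase pairs, and the `confirm` callback stands in for the spoken clarification dialogue; a call such as `translate_options("fire", ["now"], confirm=ask_user)` would defer to the user because the context carries no cue words.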
Similar resources
Lightly-Supervised Word Sense Translation Error Detection for an Interactive Conversational Spoken Language Translation System
Lexical ambiguity can lead to concept transfer failure in conversational spoken language translation (CSLT) systems. This paper presents a novel, classification-based approach to accurately detecting word sense translation errors (WSTEs) of ambiguous source words. The approach requires minimal human annotation effort, and can be easily scaled to new language pairs and domains, with only a wordal...
A Review Of Literature On Word Sense Disambiguation
Lexical ambiguity is a fundamental characteristic of language. Words can have more than one distinct meaning. Word sense disambiguation is defined as the problem of computationally determining which "sense" of a word is correct in a given context. Word sense disambiguation is a task of classification where word senses are the classes, the context provides the evidence, and each occurrence of a word...
Unsupervised Translation Disambiguation for Cross-Domain Statistical Machine Translation
Most attempts at integrating word sense disambiguation with statistical machine translation have focused on supervised disambiguation approaches. These approaches are of limited use when the distribution of the test data differs strongly from that of the training data; however, word sense errors tend to be especially common under these conditions. In this paper we present different approaches t...
Review: Semi-Supervised Learning Methods for Word Sense Disambiguation
Word sense disambiguation (WSD) is an open problem in natural language processing: identifying the appropriate sense of a word in a sentence when the word has multiple meanings. Many approaches have been proposed to solve the problem, of which supervised learning approaches are the most successful. However, supervised machine learning approaches are limited by the difficulties...
A Review on Word Sense Disambiguation
Word sense disambiguation (WSD) is described as the task of finding the sense of a word in a given context. WSD is a core problem in many tasks related to language processing. Its importance is heightened by its use in several critical applications such as part-of-speech tagging, machine translation, information retrieval, etc. Different topics such as ambiguity, evaluation, scalability and diversity cause ch...
Publication date: 2013